r, even if a methylation site was a DMS, its host gene was not
necessarily a DEG.
Suppose the kth gene was the host gene of the mth DMS and the mth
DMS was ranked at the top in the gth regression model constructed for the
gth DEG. The gene order distance between the gth DEG and the kth gene
was defined as below,
$$|g - k|$$
The second distance was called the base pair distance. The mean base-pair
position of the gth DEG was denoted by $\omega_g$ and the methylation site of the mth
DMS (the top-ranked DMS for the regression model of the gth DEG) was
denoted by $\omega_m$. The base pair distance between them was defined as below,
$$|\omega_g - \omega_m|$$
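The two distances can be illustrated with a minimal Python sketch. The function names and the argument names `omega_g` and `omega_m` are hypothetical and do not come from the original analysis; the sketch only assumes that genes are indexed by their order and that both positions are given in base pairs.

```python
def gene_order_distance(g: int, k: int) -> int:
    """Gene order distance |g - k| between the g-th DEG and the k-th (host) gene."""
    return abs(g - k)


def base_pair_distance(omega_g: float, omega_m: float) -> float:
    """Base pair distance between the position of the g-th DEG (omega_g)
    and the methylation site of the top-ranked DMS (omega_m)."""
    return abs(omega_g - omega_m)


# Illustrative values only, not taken from the study.
print(gene_order_distance(12, 1500))             # 1488
print(base_pair_distance(1_000_500.0, 250_000.0))  # 750500.0
```

Both distances are symmetric, so the order of the two arguments does not matter.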
The gene order distances and base pair distances were binned into six
bins to study the trend of the distribution pattern. Figure 4.30 shows
the gene order distance distributions of the four models. It can be seen that all
distributions had the same pattern or trend, i.e., the distributions were
skewed towards the right side, favouring large gene
order distances. This shows that the host genes of most top-ranked DMSs
were far away from the target DEGs. Note that a target DEG was one of
the DEGs.
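The binning step could be sketched as follows. The text does not state the bin boundaries, so the five edges below are purely hypothetical and were chosen only to yield six bins; `bin_index` and `bin_counts` are likewise illustrative names.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical bin edges (assumption, not given in the text).
# Five edges produce six bins: [0, 10), [10, 100), [100, 1000),
# [1000, 5000), [5000, 10000) and [10000, infinity).
BIN_EDGES = [10, 100, 1_000, 5_000, 10_000]


def bin_index(distance, edges=BIN_EDGES):
    """Return the zero-based index of the bin that contains `distance`."""
    return bisect_right(edges, distance)


def bin_counts(distances, edges=BIN_EDGES):
    """Count how many distances fall into each of the len(edges) + 1 bins."""
    counts = Counter(bin_index(d, edges) for d in distances)
    return [counts.get(i, 0) for i in range(len(edges) + 1)]


# Illustrative distances only, not taken from the study.
print(bin_counts([0, 9, 10, 99, 1000, 20000]))  # [2, 2, 0, 1, 0, 1]
```

A distribution skewed towards the right side would show most of its counts in the last bins of such a list.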
The chi-square test validated the severity or the significance of the
skewness in the four models. All p values were extremely small. Therefore, there
was no doubt that the most important contributors to the differential
expression profile in most DEGs were the remote methylation sites rather
than local ones. This demonstrated the complexity of the genetic-epigenetic
interplay in living cells. For instance, in the Lasso model, 77% (970 of
1,250 M2E models) of the target DEGs and the methylation sites of the
top-ranked DMSs of these 970 DEGs were separated by more than 1,000
genes. About 34% (425 of 1,250 M2E models) of the target DEGs and the
methylation sites of the top-ranked DMSs of these 425 DEGs were
separated by more than 10,000 genes.
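A chi-square test on binned counts could look like the sketch below. The text does not specify the null hypothesis, so a uniform distribution across the six bins is assumed here purely for illustration; the counts are made up and are not results of the study. The helper `chi2_sf` computes the upper-tail probability using only the standard library.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)                 # Q(1, z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))      # Q(1/2, z)
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chi_square_uniform(counts):
    """Goodness-of-fit test against equal expected counts (an assumed null)."""
    expected = sum(counts) / len(counts)
    stat = sum((c - expected) ** 2 / expected for c in counts)
    return stat, chi2_sf(stat, len(counts) - 1)


# Made-up, right-skewed counts for six bins (illustration only).
stat, p_value = chi_square_uniform([5, 8, 12, 20, 25, 50])
print(stat, p_value)
```

With six bins the test has five degrees of freedom, and a strongly right-skewed set of counts yields a large statistic and a very small p value, in line with the pattern described above.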